MADiff: Offline Multi-agent Learning with Diffusion Models
Diffusion models (DMs), as powerful generative models, have recently achieved huge
success in various scenarios, including offline reinforcement learning, where
the policy learns to plan by generating trajectories during online
evaluation. However, despite the effectiveness shown for single-agent learning,
it remains unclear how DMs can operate in multi-agent problems, where
independently modeling each agent's trajectories leaves agents without the
coordination that teamwork requires. In this paper, we propose MADiff, a novel
generative multi-agent learning framework to tackle this problem. MADiff is
realized with an attention-based diffusion model to model the complex
coordination among behaviors of multiple diffusion agents. To the best of our
knowledge, MADiff is the first diffusion-based multi-agent offline RL
framework that behaves as both a decentralized policy and a centralized
controller; it includes opponent modeling and can be used for multi-agent
trajectory prediction. MADiff takes advantage of the powerful generative
ability of diffusion while being well suited to modeling complex multi-agent
interactions. Our experiments show the superior performance of MADiff compared
to baseline algorithms in a range of multi-agent learning tasks.
Comment: 17 pages, 7 figures, 4 tables
An Improved Genetic-Shuffled Frog-Leaping Algorithm for Permutation Flowshop Scheduling
Owing to its NP-hard nature, the permutation flowshop scheduling problem (PFSSP) is a fundamental issue for Industry 4.0, especially under demands for higher productivity, efficiency, and self-managing systems. This paper proposes an improved genetic-shuffled frog-leaping algorithm (IGSFLA) to solve the PFSSP. In the proposed IGSFLA, the optimal initial frog (individual) in the initialized group is generated by the heuristic optimal-insert method with a fitness constraint. The crossover mechanism is applied to both the subgroup and the global group to avoid local optima and accelerate the evolution. To further improve frogs that share the same optimal fitness, the disturbance mechanism is applied to obtain the optimal frog of the whole group at the initialization step and the optimal frog of the subgroup at the searching step. The mathematical model of the PFSSP is established with the minimum production cycle (makespan) as the objective function, the fitness of a frog is defined, and the IGSFLA-based PFSSP method is proposed. Experimental results are presented and analyzed, showing that IGSFLA not only provides the optimal scheduling performance but also converges effectively.
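The makespan objective mentioned in the abstract above can be illustrated with a minimal sketch. It uses the standard permutation flowshop recurrence (a job starts on a machine once both the machine and the job's previous operation are finished); the function name `makespan`, the `proc` layout, and the sample data are assumptions for illustration and are not taken from the paper, and the IGSFLA search itself is not shown.

```python
from typing import Sequence


def makespan(perm: Sequence[int], proc: Sequence[Sequence[float]]) -> float:
    """Completion time of the last job on the last machine.

    proc[j][m] is the (hypothetical) processing time of job j on machine m;
    perm is the single job order used on every machine.
    """
    n_machines = len(proc[0])
    # finish[m]: completion time of the most recently scheduled job on machine m
    finish = [0.0] * n_machines
    for job in perm:
        prev = 0.0  # completion time of this job on the previous machine
        for m in range(n_machines):
            # The job must wait for both the machine and its own previous step.
            start = max(finish[m], prev)
            finish[m] = start + proc[job][m]
            prev = finish[m]
    return finish[-1]


# Illustrative data: 3 jobs, 2 machines (made-up processing times).
proc = [[3, 2], [1, 4], [2, 1]]
print(makespan([0, 1, 2], proc))  # 10.0
print(makespan([1, 0, 2], proc))  # 8.0
```

A metaheuristic such as the one described would evaluate candidate permutations with an objective of this kind and prefer those with a smaller makespan; here, the order `[1, 0, 2]` is better than `[0, 1, 2]`.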